#long-horizon RL
09/10/2025
AgentFlow: Planner-Only RL and Flow-GRPO for Modular, Tool-Using Agents
AgentFlow introduces a modular Planner–Executor–Verifier–Generator architecture and Flow-GRPO, a token-level on-policy method that trains only the Planner. The authors report substantial gains across ten benchmarks and release an open-source, MIT-licensed implementation.